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Abstract 

We discuss a method for tracking individual molecules which globally optimizes the 
likelihood of the connections between molecule positions fast and with high reliability even 
for high spot densities and blinking molecules. Our method uses cost functions which can 
be freely chosen to combine costs for distances between spots in space and time and 
which can account for the reliability of positioning a molecule. To this end, we describe 
a top-down polyhedral approach to the problem of tracking many individual molecules. 
This immediately yields an effective implementation using standard linear programming 
solvers. Furthermore, we derive an unbiased estimator for diffusion coefficients of blinking 
molecules, which proves well in our experimental evaluation of our approach. 

1 Introduction 

The possibility to observe single fluorescent molecules in real-time has opened up a lot of new 
insights into the dynamics of systems in biology and material sciences. Single molecule mi- 
croscopy (SMM) allows for the parallel observation of translational and rotational motion of 
many single fluorescent molecules beyond the diffraction limit provided that their concentra- 
tion is reasonably low. However, tracking of single fluorescent molecules bears the challenge 
that the fluorescent spots are rather weak with a low signal/noise-ratio and show significant 
changes in signal intensity [H El [3] . In the extreme case, a fluorescent molecule is dark for 
several recorded frames, a phenomenon which is termed blinking |H[5]. 

The fact that the fluorescence signals of single molecules cannot be classified according 
to their intensity or shape in different frames and even disappear in some frames causes 
severe problems for single molecule tracking. Thus, many tracking algorithms which have 
been developed for single particle tracking (e.g. for cells) in video microscopy fail for tracking 
single molecules. 

The way from the recorded single molecule microscopy movies to the results about their 
motion includes typically the following steps: 

(i) determination of the positions of each fluorescent molecule, 

(ii) connecting the positions to single molecule tracks, and 
(hi) statistical analysis of these tracks. 



1.1 Previous Work 



Nowadays, single molecule positions are most often determined using center of mass or Gaus- 
sian fits, with the latter being the best option for low signal-to- noise ratios of around four [6j. 
Usually the images are preprocessed by various filters, e.g. Mexican hat [7J, before the actual 
localization. 

After localization, the positions of subsequent frames have to be connected to tracks 
|8j. Different approaches have been developed for this purpose, but up to now it remains 
challenging to improve and develop algorithms not only for special tasks but for a universal 
set of problems [9] . 

Single Particle Tracking procedures started with the connection of one point with its 
closest neighbor in consecutive frames [10]. In 1999, Chetverikov et al. published a new 
algorithm called IPAN Tracker. |llj Using a competitive linking process that develops as the 
trajectories grow, this algorithm deals better with incomplete trajectories, high spot densities, 
faster moving particles and appearing and disappearing spots. Sbalzarini et al. used the same 
approach as the IPAN tracker, but did not make any assumptions about the smoothness of 
trajectories [12] . Their algorithm was implemented as ParticleTracker in ImageJ. 

The SpotTracker [7j is a very powerful tool to follow single spots throughout one movie, 
but it can only proceed spot by spot. The algorithm proposed by Bonneau et al. [13] falls in 
the same category of greedy algorithms that iteratively compute shortest paths in space-time, 
which are not revised subsequently. 

One of the most accurate solutions to single particle tracking is provided by multiple- 
hypothesis tracking (MHT). This method chooses the largest non-conflicting ensemble of 
single particle paths simultaneously accounting for all position in each frame. Jaqaman et 
al. used such an approach where they first linked positions in consecutive frames by solving 
bipartite matching problems and combined these links into entire trajectories [14] with a 
post-processing step to account for missing points in a frame. Both steps were optimized 
independently which yielded a very likely solution of the tracking problem. Dynamic multiple- 
target tracing was used by Serge et al. to generate dynamic maps of tracked molecules at high 
density. [15] Subtracting detected peaks from the images allows them for a detection of low 
intensity peaks which would be otherwise hidden in movies of high particle density. Peak 
positions were connected using statistical information from past trajectories. 

Moreover, manual or semi-automated approaches, in which only unambiguous choices 
are performed automatically, are still used though there are cumbersome due to many user 
interaction at high particle densities. 

For the analysis of the tracks, different approaches have been developed |16l 19], The most 
common approach is analysis of the mean square displacement for different time intervals 
|17l [TE[ [T9l [20] which can readily distinguish between different modes of motion such as nor- 
mal diffusion, anomalous diffusion, confined diffusion, drift and active transport [21] . Alterna- 
tively, probability distribution of squared displacements [22J and radii of gyration [2'6\ [2"3] [TO] 
can be used to analyze single molecule tracks. 

1.2 Our Contribution 

We present a method for single molecule tracking which globally optimizes the likelihood 
of the connections between molecule positions fast and with high reliability even for high 
spot densities. Our method uses cost functions which can be freely chosen to combine costs 
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for distances between spots in space and time and which can account for the reliability 
of positioning a molecule. Using a suitable positioning procedure, reliable tracking can be 
performed even for highly mobile, frequently blinking and low intensity fluorescent molecules, 
cases for which most other tracking algorithms fail. 

In the following, we present a top-down approach for modeling molecule tracking. We 
unify the previous approaches in one framework. By developing a suitable polyhedral model, 
we show theoretically that it remains computationally tractable. A major advantage of our 
method is that we immediately obtain an effective software solution using standard linear 
programming software. Moreover, we experimentally evaluate our implementation. To this 
end, we use real-world data and realistic data, i.e. randomly generated according to a physical 
model. We qualitatively compare our tracking on the real-world data to tracks obtained by 
a human expert, whereas we exploit the knowledge about the ground-truth in the realistic 
data to quantitatively measure the impact of noise and the validity of the parameters that 
we have chosen. We evaluated our approach for two-dimensional tracking, but the extension 
to 3D is straight-forward. We provide our software as open source code for MATLAB using 
CPLEX as LP-solver (see Appendix). 

2 A Polyhedral Model for Molecule Tracking 

It is easy to see that the number of possible trajectories grows exponentially with the number 
of points. To tackle this combinatorial explosion [13], we consider the a top-down polyhedral 
approach for a concise representation in this paper. Suppose we are given a set of points 
V = {v\, . . . ,v n }. Each point has one temporal and d spatial coordinates, say (ti,Xi,yi), 
i G V, with d = 2 as used in the following for the sake of presentation. Before we further 
dwell on formalism, we first formulate the following conditions for a track: 

• Each point has at most one predecessor. 

• Each point has at most one successor. 

We model the predecessor/successor relation of two points by ordered pairs. To this end, 
let Vt<tj '■= {vi £ V : ti < tj} and A := {(vi,Vj) : Vi G Vt<t i , Vj G V}. We denote the 
predecessor /successor relation by / : A — > {0,1}. Moreover, let g, h : V — > {0,1} denote 
missing predecessors and successors, respectively. That is, g(v) = 1 iff v G V does not have a 
predecessor, and h(v) = 1 iff it does not have a successor. Let 5 in (v) , 5 out (v) C A denote the 
sets of possible predecessor /successor relations for point v. 

Definition 1. A track partition or tracking of V is a collection of disjoint tracks covering 
V, i.e. each point appears in exactly one track, which might consist of a single point. 

The characteristic vector (/, g, h) of a track partition is a {0, l}-vector in which the first 
\A\ entries corresponding to / denote the predecessors/successor relation, the following \V\ 
entries corresponding to g determine the starting points of the tracks, and the last \ V\ entries 
corresponding to h define the endpoints of the tracks. 

Theorem 2. The tracking polytope, i.e. the convex hull of all track partitions, is described 
by 

P ■= {(/, 9, h) G [0, 1] W+W : G V : g(v) + /(<$*») = 1, h(v) + f(5 out (v)) = 1}. 
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Proof. It is easy to see that the characteristic vector of a tracking is contained in P. Moreover, 
each {0, l}-vector in P corresponds to a tracking. Hence, it remains to show that these are 
the only vertices of P. To this end, we prove that the constraint matrix M that defines P in 
the form P = {x : Mx = l,x > 0} is totally unimodular. First, we observe that M = (M'l) 
where M' corresponds to the /-variables. Hence, it suffices to show total unimodularity for 
M' . To this end, we consider an auxiliary graph G' = (V1IJV2, E') at which each of V\ t 2 
contains a copy of each point in V and E' is the set of edges that mimics the set A on V\ x Vi- 
Note that by this definition G' is bipartite. Moreover, its adjacency matrix is given by M'. 
Hence, M' and M are totally unimodular. □ 



2.1 Optimization 

Based on the compact representation of all possible tracks as described before, we now consider 
the problem of selecting an appropriate tracking out of all these possibilities. To this end, we 
leverage the fundamental paradigm of normal diffusion: Tracks are Markov chains, i.e. the 
transition probability from one state to another does only depend on the current state and 
not on the history that led to it. Thus, all transitions are independent random events. 

Suppose that we are given the probability distributions p\ : A — > [0, 1] for the transitions 
and p2,3 : V — > [0, 1] denoting the probability that a point is the beginning or the end of a 
track, respectively. We wish to find a tracking with maximum likelihood. That is, a tracking 
that maximizes the joint probability of the independent random events 

c(f, 9 ,h)= n pi (a) n n ^ 

aeA vev vev 

f(a)=l g(v)=l h(v)=l 

We note that our approach is universal w.r.t. to the chosen probability distributions. For 
the sake of presentation, we consider a normal diffusion model in this paper. That is, the 
probability that after time t a particle is translated by a vector r = (r x ,r y ) is proportional to 

exp [—wT ) 

where D G M is called diffusion coefficient. 

The joint probability distribution that N particles have moved by r(l), . . . ,r(N) G M 2 
after one time-step of length t is hence given by the product 

Plugging this into the joint probability distribution, we obtain 

„2(„\ 1 Jll 



£{f,9,h) = exp 



E rxia \ + D r t y[a) ■ /(a) + £ H P2 (v)) ■ g{y) + £ ln(p 3 (^)) ■ h(v) 

aeA v<=V v<=V 



Due to monotonicity of the exponential function, finding the most likely tracking amounts 
to solve the linear programming problem 

min I Yl c i («)/(«) " Yl c ^ v )9(v) ~ Y, c 3(v)h(v) : (/, g, h) G P \ 
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Note that this formulation is again general enough to capture arbitrary separable likelihood 
functions C(f,g,h) and it is not restricted to normal diffusion. 



Lemma 2.1. For all optimum solutions (/, g,h) G P and a = (v, w) G A, we have 

/(a) = 1 => dip) < c 2 (v) + c 3 (w). 
Proof. By contradiction: We would obtain a better feasible solution by setting /(a) = and 

g(y) = h(w) = 1. □ 

Although it is possible to consider different probabilities for appearing and vanishing 
particles, we choose a constant one, i.e. let 02(f) = c%{v) = C for all v G V. Hence, 
c(a) < 2 • C for all a G A which appear in any optimum solution. This inspires the definition 
of a tracking radius R such that we will only consider predecessors and successors within 
that range. This dramatically limits the size of an instance and enables us to use space 
partition techniques to efficiently construct the tracking LP. Put differently, the restriction 



to a certain tracking radius for efficiency reason is justified by Lem. 2.1 In the experiments 
section, we will discuss suitable choices for R. Similarly, it makes sense to limit the number 
of frames that a molecule might be invisible. 



2.2 Dealing with Noise 

Since the points are usually detected from noisy images, there are false-positive and false- 
negatives. That is, a spot v G V is a false-positive, if it does not correspond to any track. A 
false-negative is a point that does not have a correspondent in V. 

Moreover, there might be v,w G V that correspond to the same point. To deal with 
these duplicates, we introduce so-called joins into A. That is, we allow that two points from 
the same frame appear in one track. We thereby maintain the integrality of our polyhedron. 
However, we treat these joins differently w.r.t. the objective function to reflect the special 
situation. We propose to set the cost of such a link to the ordinary cost of that connection 
plus the mean of the penalties for not having a successor and a predecessor, respectively. 
Taking the penalty into account is necessary to avoid 2-cycles. 

Since we can only deal with points that are present in V, we shall avoid false- negatives 
in the point detection. However, a low false-negative rate often leads to a high false-positive 
rate. Therefore, we utilize the possibility to consider a quality measure of each detected 
point. That is, we reduce the cost of not tracking a point according to its quality. This can 
be modeled easily by multiplying 02,3 (v) by some quality factor q(v), e.g. proportional to the 
strength of the signal of this spot. 



3 Determination of diffusion coefficients from track positions 

A popular method to determine the diffusion coefficients from single molecule tracks is called 
track radius analysis (also known as radius of gyration analysis) [TBI I2"%1 123] , which uses the 
eigenvalues of the sample covariance matrix Q. The advantage of this method lies in the high 
accuracy of the obtained diffusion coefficients which are robust even for small trajectories. 
Whereas previously the relation was only verified by Monte Carlo random walk simulations 
of normal diffusion [25], we now give a theoretical explanation, which is in agreement with 
the experimental findings. 
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We first observe that the original definition of the track radius Rt as the sum of the 
eigenvalues of Q is equivalent to the trace of Q, i.e. R T := Ylt=i ^k(Q) = tr(Q). Hence, 
the track radius is only determined by the variances of the individual coordinates and it is 
independent of the covariances between the coordinates. As mentioned before, Rt is a biased 
estimator for the diffusion coefficient D. That is, under certain fixed conditions, e.g. no 
blinking, fixed track lengths, the diffusion coefficient is proportional to Rt- 

In the following, we show how to obtain an unbiased estimator for the diffusion coefficient 
in a normal diffusion model, which also accounts for blinking and variable track lengths. That 
is, we consider position updates of the form 

Xi = Xi-i + Si • rii for i > 1, 

where Hi is a standard normal random variable (i.e. /x = 0, a = 1) and the step-lengths Sj 
are dependent on the diffusion coefficient and the time between the observations. Hence, the 
coordinates Gaussian random variables with 

i 

E[xi] = xi + ^2 s k E[n k ] = x 1 

k=2 

i j i j min{ij} 

E[xjXj] = x\ + xi ^ s k E[n k ] + x\ ^ s k E[n k ] + ^ ^ s k s e E[n k ne] =x\+ ^ s k 

k=2 k=2 k=2 1=2 k=2 

In the case of normal diffusion, we have s\ = 2D{t k — t k ^i), where t k denotes the time of the 
kth observation. Thus, we may write 

E[xiXj] = x\ + 2D(t mm{iJ} -ti). 

We use the unbiased estimator for the variance of the x-coordinates of a track xi,... ,xn, 
which is determined as 



I N f i N \ 2 I N 1 N N 

i=\ \ k=l / i=l v ' i=l 1=1 



0C 2 0C j 



In expectation, we get 



N N N 

(N-l). E[a 2 x } = £ E[x*} E[xi Xj } 

i=l i=l j=l 

N 2D N N 

=Nx\ + 2D J2(U ~ h) -Nxl-—J2 Y.^UJ} ~ *i) 

1=1 j=l jr' = l 

2D N 

i=2 

Because the random displacements of the x- and y-coordinates are independent identically 
distributed, we obtain the same for E[a y ]. This motivates the definition of 

D _ = N(N-1) al + al = N(N-l) r2 

4 E^a-i-AOfe-ti) 4J2l 2 (2i - i - N)( ti - h) T ' 
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which is an unbiased estimator for the diffusion coefficient in d dimensions because E[D] = D. 
For the special case of uniform time-steps, we substitute U = i ■ r and obtain 



= 3_ a 2 + a 2 = 3 2 
' 2r 7V + 1 2(AT + 1)t 



Note the we defined a and Rt as the unbiased estimators of the variance and via the sample 
covariance matrix, respectively. If the maximum likelihood estimators, i.e. normalization by 
l/N instead of l/(N — I), are used, then this is simply accounted by an addition factor of 
N/(N - 1). 

Since the localization of the centers of the spots is subject to some uncertainty, we extend 
our model as follows. We define Xi := Xi + £j, where £j ~ Af(0, e 2 ) are independent normally 
distributed random variables with variance e 2 . Hence, we obtain 

E[xi] = E[ Xl ] 

E[xiXj] = E[xiXj] + E[ei] ■ E[xj] + E[xi\ ■ E[ej] + E[eiEj] = E[xiXj] + e 2 5 ij 

with Kronecker's 5a = 1 and <5jj = for i ^ j. Since we can estimate the track variances only 
based on the measured data Xj, we get in expectation 

N „ N N 



(N-l). E[a 2 x ] = ^ E[3fi " ^ E E E ^ 

i=l i=l j=l 

N N N 

=E^k 2 ] + ^ 2 -^EE^]- £2 



N 

i=l i=l j=l 

2D N 

=(N-l)e 2 + — J2&-l-N)(ti-h). 

Hence, we now have a bias in the estimator D as defined above depending on e: 

E\b ] = D + , N{N - l)el , 

2Eil 2 (M-i -mu -*i) 

Thus, uniform time-steps yield 

Put differently, the longer the tracks the less influential are the localization errors. Note that 
this derivation also holds for any independent identically distributed localization errors with 
mean and variance e 2 . 



4 Experiments 

So far, we described a generic approach for tracking blinking molecules. In this section, we 
propose our choice for the cost-function, i.e. c(vi,Vj) = (x^ — Xj) 2 + (yi — yj) 2 + (U — tj) 2 , 
which is validated experimentally. 
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Figure 1: Tracking with Ax 2 + Ay 2 on the left and with Ax 2 + Ay 2 + At 2 on the right. Two 
interleaved tracks are produced for one molecule on the left, whereas there is exactly one on 
the right. 

The rationale for using this function is based on the following observation: if time is not 
penalized, then track fragmentation becomes more likely as shown in Fig. [TJ 

Thus, we introduced the superlinear term At 2 such that two time steps of length 1 are 
cheaper than one time step of length 2. Hence, the spatial distance is mainly responsible for 
comparing positions within the same frame. We consider closer destinations to be more likely. 
Therefore, we do not use the time in this part of the objective. 

In this section, we also briefly discuss the dependencies of the results on parameters such 
as the tracking radius R (a more in-depth analysis will be published elsewhere [26]). To this 
end, we first consider a controlled testing environment based on the normal diffusion model 
mentioned above. We thereby obtain realistic randomly generated instances. Though the 
experiments with the simulated realistic data has the advantage that we know the ground 
truth and thus we can quantify the deviation of the computed results in certain situations, 
we shall also validate our approach on real-world instances. To this end, we compare the 
diffusion coefficients obtained manually by a human expert with our automated approach in 
the final subsection of this paper. 

4.1 Realistic Data 




Figure 2: Different signal-to-noise ratios (1,2,3,4, and 5 from left to right). 



Synthetic trajectories were generated to test our localization and tracking algorithms. 
Random walk simulations with 500 steps were performed using four different diffusion coeffi- 
cients (10~ 15 , 10~ 14 , 10 -13 , and 10 -12 m 2 s _1 ), five different signal-to-noise ratios (1, 2, 3, 4, 
and 5, see Fig. [2]) and five different particle densities (100, 200, 300, 400 and 500 particles per 
frame). These values can be found in many tracking challenges of practical importance, but 
were also chosen to determine the sensitivity of our algorithms. The initial x- and y-position 
of each spot was chosen uniformly at random in the range [0;500]. The single molecule trajec- 
tories were used to construct movies with 500 frames. Two dimensional Gaussian functions 
were constructed with their center located at the positions obtained from the random walk 



simulations and their widths being diffraction limited (ca. 300 nm). The time between con- 
secutive frames was chosen as 0.1 s and the resolution as 100 nm per pixel. Gaussian white 
noise was added to the movies corresponding to the signal-to-noise ratio defined as with the 
amplitude of the center of the 2D Gaussian and the standard deviation of the Gaussian white 
noise. 

4.1.1 Determination of tracking accuracy for simulated data 

Tracking procedures have to meet several conditions to be suitable. Apart from practical 
aspects such as tracking speed and memory consumption, the number of false-positives is the 
key factor which has to be minimized in order to obtain reliable results. Similarly to the 
definition of false-positive and false- negative w.r.t. particle locations, false positives in this 
context are connections between positions in different frames which have been set even though 
the positions do not belong to the same molecule/particle. They can result in severe errors in 
single molecule tracking and cause wrong interpretations of collected data. Thus, the number 
of false positives should be kept as low as possible. False negatives are connections between 
positions in different frames which are not recognized by the tracking algorithm. 

The fraction of false positives and false negatives for movies of different diffusion coef- 
ficients and signal-to-noise-ratios was calculated comparing ground truth and analyzed con- 
nections between points. The fraction of false positive connections decreases from 13% to 3% 
as the S/N-ratio increases from 1 to 5. False negatives particularly occur with fast moving 
molecules if the tracking radius is not chosen carefully. The reason is as follows. Since the 
distribution of displacements can be expressed [27] as 



Thus, picking the tracking radius too low yields biased false-negatives and hence an underes- 
timation of the diffusion coefficients. However, the tempting choice of an excessive tracking 
radius does not only require much more computational resources, but may also lead to an over- 
estimation of the diffusion coefficients if there occur leaps in the tracks due to false-positives 
(in particular with high particle densities). We propose to choose the tracking radius such 
that a displacement is smaller with a probability of about 99%. Since those leaps are easily 
determined in a post-processing steps, a repetition of the tracking with different radii in a 
feedback loop is feasible. 

Positioning and tracking are the key steps in the determination of diffusion coefficients. 
To distinguish the errors appearing in these steps, we analyzed four different cases shown in 



In case (1), the distribution of diffusion coefficient was directly calculated from the ground 
truth tracks. Though all tracks were created with respect to fixed diffusion coefficients, the 
calculation yields peaked distributions around the true values because of the finite number of 
sample points in each track. 

In a second set of analysis (case 2) the same ground truth positions were tracked using our 
polyhedral model solved with CPLEX. Good results were obtained except for fast molecules 




the probability of finding the destination within a radius of R is 




Fig. § 
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Figure 3: Steps taken from the simulated ground truth data to the analysis of the trajectories. 
For analysis of the accuracy of each single step different paths were considered which are 
labeled 1 to 4. 



and high densities. That is, the probability of foreign spots moving into the tracking range 
of another molecule is too high, and thus the tracking algorithm in general returns diffusion 
coefficients lower than the real value. 

The third set of analysis, case 3, allows for an investigation of the influence of positioning 
inaccuracies on the distribution of diffusion coefficients. Movies were constructed from the 
ground truth positions with different S/N-ratios. Our positioning algorithm was applied to 
these movies and, where possible, the positions matched to the positions of the ground truth 
tracks. With the determined positions of each track, a diffusion coefficient was obtained. The 
distribution of diffusion coefficients of these tracks resembles the distributions of the ground 
truth tracks with the exception of low diffusion coefficients with low S/N-ratios where the 
poor localization accuracy results in a seemingly higher diffusion coefficient than simulated. 

Case 4 describes the procedure which is applied for real movies to determine single 
molecule diffusion coefficients. Spots in movies are positioned, the positions tracked and 
a diffusion coefficient calculated from these tracks. For high S/N-ratios the obtained distri- 
butions are similar to the simulated distributions. Two trends can be observed in particular 
at low S/N- ratios: 

(i) molecules with low diffusion coefficients tend to be analyzed as being faster as they were 
simulated, and 

(ii) analysis of the motion of fast molecules in average results in a lower diffusion coefficient 
as the ground truth data. 

The former observation can be explained by the poorer localization accuracy at low S/N-ratios. 
This inaccuracy resembles diffusion and thus a low diffusion coefficient will be assigned even 
to immobile molecules. The localization accuracy determines the lowest diffusion coefficient 
which can be determined by the corresponding experimental settings. 

4.1.2 Tracking of blinking data 

In the previous subsection, points were missing due to false negatives in the localization at 
low S/N-ratios. However, even with perfect localization missing points may occur naturally 
in real-world experiments because the fluorescence intensity of single molecules is typically 
not constant, but shows blinking behavior due to photochemical or photophysical quenching 
processes p] such as intersystem crossing to the triplet state [281 123 GO] or electron transfer 
[31] . The lengths of on- and off-times typically show a power law distribution pQ . In order to 
simulate blinking behavior, we generated on- and off-times for our simulated tracks using the 
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following procedure. At the beginning a molecule was u.a.r. set as on or off. The number of 
frames remaining in this state was determined randomly from a probability distribution with 
P(t) ~ (i/r) a . We chose realistic values for r and a, i.e. r = Is and a = —2, respectively. 




-14 -13 
log (□ / mV 1 ) 




-14 -13 
log (D / mV 1 ) 



Figure 4: Distributions of diffusion coefficients obtained after tracking of blinking data for a 
maximum step length of 5 (left) and 15 (right). 

The analyzed distributions of diffusion coefficients for tracking of blinking ground truth 
data are shown in Fig. |4| For a maximum step length of 5 pixels, the distributions of dif- 
fusion coefficients resemble the ground truth data except for faster molecules with diffusion 
coefficients above 10 -13 m 2 s -1 where a step length of 5 pixels is not sufficient for tracking 
as discussed before. Tracking with a maximum step length of 15 pixels gives good results 
for fast diffusing molecules, but for simulated diffusion coefficients of 10 -14 m 2 s _1 and below 
results a long tail or even a second band at higher D values (see Fig. [4] (right)). The deviation 
of distribution from the ground truth distribution is caused by tracks which include at least 
one large jump from one molecule to another one which results in a significant increase of the 
track radius. 



4.2 Real- World Data 

The real-world data were obtained from single molecule fluorescence widefield experiments 
during the bulk radical polymerization of styrene to polystyrene. The motion of single pery- 
lene diimide fluorophores was observed at various monomer-to-polymer-conversions and thus 
different viscosities which allowed us to probe a broad range of diffusion coefficients. The 
interested reader is referred to |25j . 

Before this project, the tracks were constructed semi-manually due to the lack of a satisfy- 
ing alternative. That is, only a simple search in the neighborhood of the points was performed 
automatically as long as there were no ambiguities, i.e. only one localized point within the 
tracking radius and no competition among potential predecessors. In the case when the auto- 
matic continuation of the tracks fails, the user was presented with 10 consecutive frames of the 
movie with the options to select a successor among the alternative, to introduce a new spot 
that was not detected by the localization, or to end the track. Needless to say that this was a 
tedious task, which took several working days to complete the tracking of a 5-minute-movie 
with high particle density. The advantage of this method is that the human expert maintains 
the full control over the process and the pattern recognition capabilities of the human brain 
is leveraged to resolve situations in which the image processing tools fail. On the other hand, 
these possibilities are also a disadvantage as the user might introduce systematic errors in the 
data and it is unlikely that a repetition of the task yields exactly the same results. 
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It remains to show that our automatic method not only works for realistic data but also 
in the real-world. To this end, we compare the average diffusion coefficients obtained from 6 
movies by manual and automatic tracking: 



manual 


0.019 


0.053 


0.126 


0.537 


1.166 


4.864 


•10~ 13 mV 1 


automatic 


0.023 


0.054 


0.132 


0.509 


1.054 


4.372 
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5 Appendix 



% Copyright (c) 2012, Andreas Karrenbauer (andreas .karrenbauer@uni-konstanz.de) 

% Copyright (c) 2012, Christoph KSlbl ( christoph . koelbl@uni— konstanz . de) 

% Copyright (c) 2012, Dominik W611 (dominik.woell@uni—konstanz.de) 

% All rights reserved. 

% Redistribution and use in source and binary forms, with or without 

% modification, are permitted provided that the following conditions are met: 

% * Redistributions of source code must retain the above copyright 

% notice, this list of conditions and the following disclaimer. 

% * Redistributions in binary form must reproduce the above copyright 

% notice, this list of conditions and the following disclaimer in the 

% documentation and/or other materials provided with the distribution. 

% * The name of the author may not be used to endorse or promote products 

% derived from this software without specific prior written permission. 

% THIS SOFTWARE IS PROVIDED BY THE AUTHOR ''AS IS'' AND ANY EXPRESS OR IMPLIED 

% WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 

% MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO 

% EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 

% SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 

% PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; 

% OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 

% WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 

% OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF 

% ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 



function [track] =t racking (pos, R, T, mini en, cost fen, jo in radius, useQF, weight fen) 

%% input variables: 

% pos: Matrix with spot positioning data in the following structure 

% [frame number, x— position, y— position, spot quality (optional)] 

% example : 

% 1 234 .1 367 . 4 0.80 

% 1 345.2 122 .5 0.91 

% 2 237 . 3 365 .2 0.75 

% 2 340 .2 121 . 9 . 97 

% R: Maximum allowed step length for automatic tracking (in units of x— and 

% y— position) 

% T: Maximum number of frames between position belonging to the same track 

% minlen: Minimum length of a track (shorter tracks will be deleted directly) 

% costfen: cost function with spatial and temporal coordinates, 

% e.g. costf cn= ' x . "2+y . ~2+t . "2 ' . 

a 
o 

% joinradius: join radius; set to if not used 

% useQF : if weight function for spot quality is used 1, else 
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% weightfnc: function with quality variable 'q', e.g. 'q.~(l/2)' 
%% example : 

% [track] =t racking (pos, 5,5,3, ' x . ~ 2+y . ~ 2+t . " 2 ' , , , ' ' ) ; 

%% Matlab code 

costs = inline (costfcn, ' x ',' y ',' t ') ; 
pos = pos (pos ( : , 1 ) >0 , : ) ; 
n=size (pos , 1 ) ; 

Quality_f unction = inline (weightfcn, ' q ') ; 
if size (pos, 2)<4 || useQF==0 
p=costs (R, R, T) *ones (n, 1) ; 

else 

p=costs (R, R, T) *Quality_function (pos ( : , 4) ) ; 

end 

[arcs,m] = generate_arcs ( pos, R, T, joinradius ) ; 
disp('Arcs constructed'); 

El=speye (n) ; 

c = costs ( pos (arcs ( : , 2 ) , 2 ) — pos ( arcs ( : , 1 ) , 2 ) , . . . 

pos (arcs (:, 2), 3) —pos (arcs(:,l),3),... 

pos(arcs(:,2),l) —pos (arcs(:,l),l) ); 
links = pos (arcs ( : , 2 ) , 1 ) ==pos (arcs ( : , 1 ) , 1 ) ; 

c(links) = c(links) + . 5* (p (arcs ( links , 1 ) ) + p (arcs ( links , 2 ))) ; 
disp ( 'Matrix constructed'); 

cplex = Cplex (' Tracking 1 ) ; 
cplex .Model . sense = 'minimize'; 
Z=spalloc (n, n, n) ; 
disp (' Initializing CPLEX'); 

cplex . addCols ([ c ; p ; p] , [], [], [] ); 

cplex . addRows (ones (n, 1 ) , [El(:,arcs(:,2)) El Z ] , ones (n, 1 ) ) ; 
cplex . addRows (ones (n, 1 ) , [El(:,arcs(:,l)) Z El ] , ones (n, 1 ) ) ; 
disp ( ' Solving LP ' ) ; 

cplex . Par am. Ipmethod . Cur =4 ; 
cplex . solve ( ) ; 
cplex . Par am. Ipmethod . Cur =2 ; 
disp (' Processing solution'); 

% POSTPROCESSING 

successor=zeros (n, 1) ; 
first=zeros (n,2) ; 
len=zeros (n, 1) ; 
roots= [ ] ; 

usedarcs=arcs ( cplex. Sol ut ion. x(l:m)==l, : ) ; 
for a = 1 : length (usedarcs ) 

successor (usedarcs (a, 1) ) =usedarcs (a, 2) ; 
if first (usedarcs (a, 1 ), 1 ) ==0 

roots= [roots; usedarcs (a, 1 )] ; 

fir st (usedarcs ( a, 1 ), 1 ) = usedarcs (a, 1 ) ; 

end 

f = first (usedarcs (a, 1) , 1) ; 
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f irst (usedarcs (a, 2 ) , 1 ) = f; 
len(f) = len(f) + 1; 

end 

disp (' Storing tracks'); 

lr = len (roots ( : ) ) > minlen; 

longroots = sortrows ([ len ( root s ( lr) ) root s ( lr ) ] , — 1 ) ; 
f irst (longroots ( : , 2) , 2 ) =1 : size (longroots , 1) ; 
usedarcs (:, 4 ) = first (first (usedarcs (:, 1) , 1) , 2) ; 

usedspots = cplex . Solution . x (m+1 : m+n) +cplex . Solution . x (n+m+1 : 2 *n+m) <= 1; 
usedspots (usedspots ) = f irst ( f irst (usedspots, 1 ), 2 ) >0 ; 
track = [ pos (usedspots (:), 2 ) pos (usedspots (:), 3) ... 

pos (usedspots (:),1) first (first (usedspots (:),1), 2)]; 
track = track (track (:, 4) >0, :) ; 
track = sortrows (track, 4 ) ; 
disp (' Tracking finished'); 



function [ arcs , m] =generate_arcs ( spots, R, T, joinradius) 
arcs = zeros ( 10 , 3 ) ; 
m=0 ; 

n = size ( spot s , 1 ) ; 
minx = min ( spots(:,2) ); 
miny = min ( spots(:,3) ); 
maxx = max ( spots(:,2) ); 
maxy = max ( spots(:,3) ); 
width = maxx — minx; 
height = maxy — miny; 
fieldwidth = R; 

xfields = floor (width/f ieldwidth) +1; 
yfields = floor (height /fieldwidth) +1 ; 
mg=max ( spot s ( : , 1 ) ) ; 

candidate = zeros ( mg, xfields, yfields ) ; 
data = zeros ( mg, xfields, yfields, 1 ); 
coord = zeros (n, 4); 

for i = 1 : n 

t=spots ( i , 1 ) ; 

binx = f loor (( spot s ( i, 2) —minx) / fieldwidth) +1 ; 
biny = f loor (( spot s ( i, 3) —miny )/ fieldwidth) +1 ; 
candidate (t , binx, biny) =candidate (t , binx, biny ) +1 ; 
if candidate (t, binx, biny) > size (data, 4) 

data = cat (4, data, zeros (mg, xfields, yfields, size (data, 4 ) ) ) ; 

end 

data (spots (i, 1) , binx, biny, candidate (t , binx, biny )) =i ; 
coord (i, 1) =spots (i, 1) ; 
coord ( i , 2 ) =binx; 
coord(i,3)=biny; 

coord ( i , 4 ) =candidate (t , binx, biny) ; 

end 

data=data ( : , : , : , 1 :max (candidate (:))); 
for i=l : n 

list = data ( coord (i, 1 ), max ([ 1 coord (i, 2) — 1] ) :min ( [coord (i, 2 ) +1 xfields]),... 

max ( [ 1 coord (i, 3) — 1] ) :min ( [coord (i, 3) +1 yfields]),:); 
list = list ( list > & list "= i ) ; 
list = list ( (spots (i,2)-spots (list (:) ,2) ) ."2 +... 
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(spots (i, 3) — spots ( list ( : ) , 3) ) . "2 < joinradius ~ 2 ) 
1 = length (list ) ; 
if 1 > 

if m+1 > size (arcs, 1) 

arcs = [arcs; zeros (m+1 , 3 ) ] ; 

end 

arcs (m+ (1 : 1) ' , 1) = list; 
arcs (m+ ( 1 : 1 ) ' , 2 ) = i*ones(l,l); 
arcs (m+ (1:1) ' , 3) =m+(l:l)'; 
m = m + 1 ; 

end 

list = data( coord ( i, 1) -1 :-l :max ([ 1 coord (i, 1) -T] ),.. . 

max ( [ 1 coord (i, 2 ) — 1 ] ) :min ( [ coord ( i , 2 ) +1 xfields]) 
max ( [ 1 coord (i, 3) — 1 ] ) :min ( [ coord ( i , 3 ) +1 yfields]) 

list = list (list ( : ) >0) ; 

list = list ( ( spot s ( i, 2) — spot s ( list ( : ) , 2 ) ) . * 2 +... 

(spots (i, 3)-spots (list (:), 3) ). "2 < R~2 ); 
1 = length ( list ) ; 
if 1 > 

if m+1 > size (arcs, 1) 

arcs = [arcs; zeros (m+1 , 3 ) ] ; 

end 

arcs (m+ (1 : 1) ' , 1) = list; 
arcs (m+ ( 1 : 1 ) ' , 2 ) = i*ones(l,l); 
arcs (m+ (1 : 1) ' , 3) =m+(l:l)'; 
m = m + 1 ; 

end 

end 

arcs = arcs ( 1 : m, :); 
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